Concurrency controls and flow limits #1421

Draft
NathanColosimo wants to merge 37 commits into vercel:main from NathanColosimo:codex/flow-limits-types-first-pass

@NathanColosimo
Contributor

@NathanColosimo NathanColosimo commented Mar 17, 2026

Context: Issue #1206 and discussion #301

Current status:

  • Stubbed out the interfaces, construction, and basic tests
  • Built initial world-local implementation
  • Built world-postgres implementation
  • Have basic e2e running for Next.js stable and Turbopack
  • Added a temporary FLOW_LIMITS.md file documenting how it works

Todo:

  • Clean up types so that they are shared where possible
  • Review the Postgres DB schema
  • Try to share a single test contract, to reduce test duplication and so that all limit implementations follow the same spec
  • Right now, if a key has existing leases or waiters, you can't redefine its limit definition (rate limit / concurrency limit). Open question: should acquiring a lock with only a key name and no definition also error, or should an undefined definition error only when no key with an existing definition already exists?

Not planning to implement in this PR:

  • Deciding how to extend leases on infinite workflows (auto-heartbeat, or add the same amount to leaseTTL on sleep or delayed suspension?)
  • Allowing manual .heartbeat in workflow code, since that needs to be made replayable
  • Ring-buffer rate limiting; right now I only have sliding window
  • Automated cleanup? We aren't cleaning up the workflow steps table, which has significantly more rows / bloat, so I don't think it's a huge issue right now.
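
For readers unfamiliar with the sliding-window approach mentioned in the list above, here is a minimal in-memory sketch. It is an illustration only, not the code in this PR: the class name `SlidingWindowLimiter`, its parameters, and the choice to pass `now` explicitly are all assumptions made for the example.

```typescript
// Hypothetical sliding-window-log limiter (illustrative, not the PR's code).
// It remembers the timestamps of accepted calls and admits a new call only
// if fewer than `maxPerWindow` of them fall inside the last `windowMs`.
class SlidingWindowLimiter {
  private readonly maxPerWindow: number;
  private readonly windowMs: number;
  private timestamps: number[] = [];

  constructor(maxPerWindow: number, windowMs: number) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
  }

  // `now` is passed in (milliseconds) so the behaviour is deterministic.
  tryAcquire(now: number): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have slid out of the window.
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift();
    }
    if (this.timestamps.length >= this.maxPerWindow) {
      return false; // over the limit: the caller would wait or be queued
    }
    this.timestamps.push(now);
    return true;
  }
}

const limiter = new SlidingWindowLimiter(2, 1000);
const results = [0, 100, 200, 1000, 1050].map((t) => limiter.tryAcquire(t));
```

A persistent implementation would presumably keep these timestamps in the backing store rather than in memory; that part is out of scope for this sketch.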

@NathanColosimo NathanColosimo requested a review from a team as a code owner March 17, 2026 19:33
@changeset-bot

changeset-bot bot commented Mar 17, 2026

🦋 Changeset detected

Latest commit: c1be937

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 24 packages
Name Type
@workflow/swc-playground-wasm Patch
@workflow/swc-plugin Patch
@workflow/world-postgres Patch
@workflow/world-testing Patch
@workflow/world-vercel Patch
@workflow/world-local Patch
@workflow/web-shared Patch
@workflow/sveltekit Patch
@workflow/builders Patch
workflow Patch
@workflow/errors Patch
@workflow/rollup Patch
@workflow/vitest Patch
@workflow/astro Patch
@workflow/nitro Patch
@workflow/world Patch
@workflow/core Patch
@workflow/nest Patch
@workflow/next Patch
@workflow/nuxt Patch
@workflow/vite Patch
@workflow/cli Patch
@workflow/web Patch
@workflow/ai Patch

@vercel
Contributor

vercel bot commented Mar 17, 2026

@NathanColosimo is attempting to deploy a commit to the Vercel Labs Team on Vercel.

A member of the Team first needs to authorize it.

@NathanColosimo NathanColosimo marked this pull request as draft March 17, 2026 19:33
@NathanColosimo
Contributor Author

@VaguelySerious @pranaygp

Read:
workbench/example/workflows/99_e2e.ts
packages/world/FLOW_LIMITS.md

What do you think?

Signed-off-by: nathancolosimo <nathancolosimo@gmail.com>
I, nathancolosimo <nathancolosimo@gmail.com>, hereby add my Signed-off-by to this commit: b0e2f2a

@NathanColosimo NathanColosimo force-pushed the codex/flow-limits-types-first-pass branch from e0ac4eb to b761df2 Compare March 18, 2026 21:22
@NathanColosimo NathanColosimo force-pushed the codex/flow-limits-types-first-pass branch from b761df2 to 1677f3d Compare March 18, 2026 21:52
I, nathancolosimo <nathancolosimo@gmail.com>, hereby add my Signed-off-by to this commit: 4b918ca

@NathanColosimo
Contributor Author

NathanColosimo commented Mar 18, 2026

I have await using lease = lock() working at both the workflow level and the step level.

At the workflow level, the workflow blocks at that point and is only re-enqueued once the lock can be acquired, in FIFO order. While waiting it doesn't take up a concurrency slot for infra-level env vars like WORLD LOCAL MAX CONCURRENCY or whatever, which makes sense.

The step level is trickier:
if the lock is called as the first line or main thing in the step, it makes sense to abort the step (without counting it as an error towards max retries) and come back once the lock can be acquired. We pre-acquire the lock before resuming (so we don't spin up a step just to fail the acquire) and pass the lock down to the step via the shared storage interface.

What do we do if await using lease = lock() is called in the middle of a step?

  • We could wait in process (taking up a concurrency slot)
  • Or we could wait in the queue (which doesn't take up a concurrency slot, but re-runs all the step code up to that point, so there will be extra side effects unless the pre-lock code is idempotent). Right now I think this is best; I would rather not take up a concurrency slot.
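
To make the FIFO behaviour described above concrete, here is a small in-memory sketch of a concurrency limit with a waiter queue. It is not the PR's implementation: `FifoConcurrencyLimit`, `acquire`, `release`, and the string run IDs are hypothetical names chosen for illustration, and a real world would persist holders and waiters rather than keep them in memory.

```typescript
// Hypothetical in-memory model of a FIFO concurrency limit (illustrative only).
type AcquireResult = "granted" | "queued";

class FifoConcurrencyLimit {
  private readonly maxConcurrency: number;
  private readonly holders = new Set<string>();
  private readonly waiters: string[] = [];

  constructor(maxConcurrency: number) {
    this.maxConcurrency = maxConcurrency;
  }

  // A caller that is "queued" would suspend instead of occupying a slot.
  acquire(runId: string): AcquireResult {
    // Idempotent for a run that already holds a lease (e.g. after re-enqueue).
    if (this.holders.has(runId)) return "granted";
    // Only grant directly when nobody is waiting, so newcomers cannot jump the queue.
    if (this.holders.size < this.maxConcurrency && this.waiters.length === 0) {
      this.holders.add(runId);
      return "granted";
    }
    if (!this.waiters.includes(runId)) this.waiters.push(runId);
    return "queued";
  }

  // Releases a lease, hands it to the oldest waiter, and returns that
  // waiter's ID so the caller can re-enqueue it (undefined if none).
  release(runId: string): string | undefined {
    if (!this.holders.delete(runId)) return undefined;
    const next = this.waiters.shift();
    if (next !== undefined) this.holders.add(next);
    return next;
  }
}

const limit = new FifoConcurrencyLimit(1);
const first = limit.acquire("run-a");
const second = limit.acquire("run-b");
const third = limit.acquire("run-c");
const promoted = limit.release("run-a");
const resumed = limit.acquire("run-b");
```

Because the lease is handed to the waiter at release time, the resumed run's `acquire` succeeds immediately, which mirrors the idea of pre-acquiring the lock before spinning the step back up.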

@VaguelySerious @pranaygp

@VaguelySerious
Member

@NathanColosimo We spent some time today discussing this and I'll get back to you with some ideas and thoughts tomorrow

@VaguelySerious
Member

VaguelySerious commented Mar 19, 2026

@NathanColosimo thank you for your patience. I just posted a WIP implementation spec for locks in the RFC discussion here. Curious to hear what you think.

This does not block your PR directly. This PR can still ship and add lock for local/postgres worlds, under a few assumptions:

  • The user-land code/DX stays the same (already true)
  • We get rid of the step_deferred event, since it lacks forward-compat. Instead of storing a new event, we can re-use step_retrying and add a step_deferred attribute to the queue message that calls the step, so that the step runtime knows not to increase the retry counter.
  • We validate that, as long as no lock statement is encountered, the added code in this PR does not trigger for world-vercel, with the exception of checking for the lock statements (should be simple)
  • We ensure forward-compat when upgrading world-postgres to a spec similar to the proposed one, without requiring active concurrency/rate limits to be maintained (though we could potentially write a migration to do so)
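
The step_deferred suggestion in the list above could look roughly like the sketch below. This is an assumption-laden illustration, not the actual queue message schema: the `StepQueueMessage` shape, the `stepDeferred` field name, and the `nextAttempt` helper are all hypothetical.

```typescript
// Hypothetical queue message shape (illustrative, not the real schema).
interface StepQueueMessage {
  stepId: string;
  // Number of attempts already counted against max retries.
  attempt: number;
  // Set when the step was re-queued while waiting for a lock,
  // rather than because it failed.
  stepDeferred?: boolean;
}

// Computes the attempt number for the invocation this message triggers.
// A deferred re-queue keeps the counter unchanged; a real retry increments it.
function nextAttempt(message: StepQueueMessage): number {
  return message.stepDeferred === true ? message.attempt : message.attempt + 1;
}

const afterDeferral = nextAttempt({ stepId: "step-1", attempt: 2, stepDeferred: true });
const afterFailure = nextAttempt({ stepId: "step-1", attempt: 2 });
```

Carrying the flag on the message rather than in the event log means older readers of the log only ever see step_retrying, which is the forward-compat concern raised in the comment.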

@NathanColosimo NathanColosimo force-pushed the codex/flow-limits-types-first-pass branch from b4179d3 to a22fd76 Compare April 1, 2026 02:41
…vercel backend
